Loading packages

library(bayesrules)
library(rstanarm)
## Loading required package: Rcpp
## This is rstanarm version 2.21.1
## - See https://mc-stan.org/rstanarm/articles/priors for changes to default priors!
## - Default priors may change, so it's safest to specify priors, even if equivalent to the defaults.
## - For execution on a local, multicore CPU with excess RAM we recommend calling
##   options(mc.cores = parallel::detectCores())
library(bayesplot)
## This is bayesplot version 1.8.1
## - Online documentation and vignettes at mc-stan.org/bayesplot
## - bayesplot theme set to bayesplot::theme_default()
##    * Does _not_ affect other ggplot2 plots
##    * See ?bayesplot_theme_set for details on theme setting
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5     ✓ purrr   0.3.4
## ✓ tibble  3.1.4     ✓ dplyr   1.0.7
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   2.0.1     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(broom.mixed)
library(tidybayes)

Exercise 11.1

We might want to build a regression model with more than one predictor when a single predictor cannot explain the full picture, or when using only one would paint a misleading picture and give a poor model (e.g. an outcome whose density plot has two modes, reflecting the distributions of two different groups; the grouping variable is the natural second predictor).
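
A minimal base-R sketch of that bimodal situation (hypothetical numbers, not from any dataset in the book):

```r
# Hypothetical numbers: an outcome that looks bimodal on its own because
# it mixes two groups; the group indicator is the natural second predictor.
set.seed(1)
group <- rep(c("A", "B"), each = 100)
y <- ifelse(group == "A", rnorm(200, mean = 10, sd = 1), rnorm(200, mean = 20, sd = 1))
# density(y) would show two modes; the group means sit far apart:
tapply(y, group, mean)
```

A model of y with no predictors (or one unrelated predictor) would sit between the modes and describe neither group well.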

Exercise 11.2

a- There is no indicator for the Ford category because B0 (the intercept) accounts for it: when the make indicators X1-X3 are all 0, B0 is the baseline, indicating the typical mpg for a Ford

b- B2 is the difference in typical mpg when the make of a car is a Subaru rather than the baseline Ford

c- B0 is the mpg of a Ford (the “average” or “baseline” mpg)

Exercise 11.3

a- B0 is the weight of a “newborn” Mr. Stripey tomato (0 days old); B1 is the change in the tomato’s weight in grams for each additional day it has been growing; and B2 is the change in the tomato’s weight in grams if it is a Roma tomato (as opposed to a Mr. Stripey tomato)

b- If X2 were 0, that would mean the tomato is a Mr. Stripey type tomato

Exercise 11.4

a- For X1 and X2 to interact means that the relationship between age and weight depends on the type of tomato. For instance, Roma tomatoes may gain weight at a faster rate per day than Mr. Stripey tomatoes.
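
A hypothetical simulation of that interaction (made-up coefficients, not the book's data), showing that `lm` recovers the extra per-day slope for Romas:

```r
# Simulate tomatoes whose daily growth rate differs by type.
set.seed(2)
days <- runif(100, 0, 50)
roma <- rep(0:1, 50)                    # X2: 1 = Roma, 0 = Mr. Stripey
weight <- 10 + 3 * days + 1.5 * days * roma + rnorm(100, 0, 5)
fit <- lm(weight ~ days * roma)         # expands to days + roma + days:roma
coef(fit)["days:roma"]                  # recovers the extra slope (about 1.5)
```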

b- B3 is the difference in the daily growth rate (grams per day) between the two tomato types, i.e. how the weight-vs-age slope changes with tomato type.

Exercise 11.5

knitr::include_graphics("/Users/madelynnwellons/Downloads/IMG_4150.JPG")

c- Two other ways are context and hypothesis tests

Exercise 11.6

a- It can be beneficial to add predictors to models because the new predictor may be able to account for more variance and explain the data better

b- It can be beneficial to remove predictors from models if the model is “too noisy”, or if predictors are essentially the same (e.g. if two predictors correlate heavily, the model would benefit from only one of them being present instead of both)

c- I might add the child’s gender, because girls and boys have their growth spurts at different ages, and boys in general have larger feet than girls after puberty

d- I might remove the indicator of whether a child knows how to swim, because that does not have anything directly to do with shoe size
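
The correlated-predictors point in (b) can be checked numerically; for example, in the built-in mtcars data (an illustration, not the shoe-size setting), engine displacement and car weight carry much of the same information, so a model rarely needs both:

```r
# Two predictors that are "essentially the same": their correlation is
# close to 1, so including both adds noise rather than information.
cor(mtcars$disp, mtcars$wt)
```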

Exercise 11.7

a- Qualities of a good model include having a good balance of bias and variance (not too many or too few predictors), being a fair model, being “less wrong” (lower levels of error), and having accurate posterior predictions

b- Qualities of a bad model include having too much bias or variance, not being a fair model, having high levels of error, and having inaccurate posterior predictions

Exercise 11.8

We can use visualizations (e.g. plotting posterior predictions against the data), cross-validation (k-fold cross-validation, where we train the model on portions of the data, test its accuracy on the held-out portion, and compare the accuracy of each model), and ELPD (expected log-predictive density, which measures the accuracy of each model’s posterior predictions; a higher ELPD means that model is “better”, i.e. more accurate).
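
A base-R sketch of the k-fold idea, with ordinary least squares standing in for the Bayesian model (the book's `bayesrules::prediction_summary_cv()` does the Bayesian version); a lower average held-out error means the better model:

```r
# 10-fold cross-validation of two lm models on mtcars (illustrative only).
set.seed(4)
folds <- sample(rep(1:10, length.out = nrow(mtcars)))
cv_mae <- function(formula) {
  fold_errs <- sapply(1:10, function(k) {
    fit <- lm(formula, data = mtcars[folds != k, ])           # train on 9 folds
    mean(abs(mtcars$mpg[folds == k] -
               predict(fit, mtcars[folds == k, ])))           # test on the rest
  })
  mean(fold_errs)
}
cv_mae(mpg ~ wt)       # one predictor
cv_mae(mpg ~ wt + hp)  # two predictors
```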

Exercise 11.9

The bias-variance trade-off is essentially the trade-off you face when you add or remove predictors from a model: when you add predictors, you increase the variance and reduce the bias, and vice versa when you remove predictors. This is important because a good balance of bias and variance gives you a “good” model, one that is not overfit to your specific dataset but is not so loose as to be unhelpful for prediction.
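
A quick overfitting demo with simulated data: training error always falls as we add polynomial terms (a nested-models fact), but held-out error typically rises again once the model chases noise:

```r
# Fit polynomials of increasing degree to noisy sin() data and compare
# training vs held-out mean squared error.
set.seed(3)
x <- runif(60); y <- sin(2 * pi * x) + rnorm(60, 0, 0.3)
train <- 1:40; test <- 41:60
mse <- function(deg) {
  fit <- lm(y ~ poly(x, deg), data = data.frame(x = x[train], y = y[train]))
  c(train = mean(fit$residuals^2),
    test  = mean((y[test] - predict(fit, data.frame(x = x[test])))^2))
}
sapply(c(1, 3, 12), mse)  # training MSE shrinks with degree; test MSE need not
```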

Exercise 11.10

# Alternative penguin data
penguin_data <- penguins_bayes %>% 
  filter(species %in% c("Adelie", "Gentoo"))

a- First we can plot the penguin data (with the species as colors since they are categorical variables):

ggplot(penguin_data, aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
  geom_point() +
  geom_smooth(method = "lm")
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 2 rows containing non-finite values (stat_smooth).
## Warning: Removed 2 rows containing missing values (geom_point).

It looks like the Gentoo species of penguins have both higher body mass and higher flipper length on average than the Adelie penguins. There is also a positive relationship between flipper length and body mass.

b- Simulating the posterior normal regression model:

penguin_data <- na.omit(penguin_data)
penguins_model <- stan_glm(
  body_mass_g ~ flipper_length_mm + species,
  data = penguin_data, family = gaussian, 
  prior_intercept = normal(4000, 250),
  chains = 4, iter = 10000, seed = 12345)

c- Visually, we can do a posterior predictive check as well as another plot of the model (e.g. overlaying posterior “draws”):

pp_check(penguins_model)

penguin_data |> 
  add_epred_draws(penguins_model, ndraws = 100) |> 
  ggplot(aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
    geom_line(aes(y = .epred, group = paste(species, .draw)), alpha = .1) +
    geom_point(data = penguin_data, size = 0.1)

In the first plot we can see that there is clearly something we are missing (likely an interaction between species and another variable), as the distribution is bimodal and appears to be two distributions pushed together. In the second plot we see that the draws from the model are similar to the plot we made earlier from the data- the model mimics the data well, but something is still not capturing the whole picture.

We can now numerically examine the 80% posterior credible interval for each predictor coefficient:

# Posterior summaries
posterior_interval(penguins_model, prob = 0.8, 
                   pars = c("flipper_length_mm", "speciesGentoo"))
##                        10%       90%
## flipper_length_mm 37.66980  47.02044
## speciesGentoo     97.18318 377.73332

The flipper length interval here is not too wide, but the species interval is extremely wide!

d- Below is the tidy summary of the model:

tidy(penguins_model, effects = c("fixed", "aux"),
     conf.int = TRUE, conf.level = .8)
## # A tibble: 5 × 5
##   term              estimate std.error conf.low conf.high
##   <chr>                <dbl>     <dbl>    <dbl>     <dbl>
## 1 (Intercept)        -4353.     702.    -5238.    -3457. 
## 2 flipper_length_mm     42.4      3.70     37.7      47.0
## 3 speciesGentoo        238.     111.       97.2     378. 
## 4 sigma                391.      17.2     370.      414. 
## 5 mean_PPD            4325.      34.2    4282.     4369.

The flipper length coefficient here is 42.4, which means that on average, for every 1 mm increase in flipper length, a penguin’s body mass increases by 42.4 grams. The species coefficient here is 238, which means that on average, a Gentoo penguin weighs 238 grams more than an Adelie with the same flipper length.

e- Below is the simulation of the posterior prediction and the plot:

# Simulate a set of predictions
set.seed(12345)
Adelie197_prediction <- posterior_predict(
  penguins_model,
  newdata = data.frame(flipper_length_mm = 197, 
                       species = "Adelie"))

# Plot the posterior predictive model
mcmc_areas(Adelie197_prediction) +
  xlab("Adelie 197 mm weight")

For this specific penguin, the model estimates a weight of around 4000 g, with most of the plausible values ranging between roughly 3750 and 4250 g.
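
A hand-rolled sketch of what `posterior_predict()` is doing for this one penguin, using the tidy() estimates above as stand-ins for real posterior draws. (Treating the coefficients as independent ignores their posterior correlation, so the spread here is exaggerated; it is only illustrative.)

```r
# For each (approximate) parameter draw, compute the regression mean for a
# 197 mm Adelie and add Normal noise with that draw's sigma.
set.seed(12345)
n <- 4000
b0    <- rnorm(n, -4353, 702)   # intercept draws
b1    <- rnorm(n, 42.4, 3.7)    # flipper length coefficient draws
sigma <- rnorm(n, 391, 17.2)    # residual sd draws
mu    <- b0 + b1 * 197          # Adelie penguin, so the species term is 0
pred  <- rnorm(n, mu, sigma)    # one simulated body mass per draw
quantile(pred, c(0.1, 0.5, 0.9))
```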

Exercise 11.11

a- Simulating the posterior for this model:

penguins_model2 <- stan_glm(
  body_mass_g ~ flipper_length_mm + species + flipper_length_mm:species,
  data = penguin_data, family = gaussian, 
  prior_intercept = normal(4000, 250),
  chains = 4, iter = 10000, seed = 12345)

b- Plots for the 50 posterior lines:

penguin_data |> 
  add_epred_draws(penguins_model2, ndraws = 50) |> 
  ggplot(aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
    geom_line(aes(y = .epred, group = paste(species, .draw)), alpha = .1)

This looks similar to the last plot we made of these variables, which is a good sign. Knowing the context of the interaction term, I notice that the Gentoo lines tend to have a steeper slope, indicating that the Gentoo species may have a stronger relationship between flipper length and body mass.

c- Below is the tidy summary for the model:

tidy(penguins_model2, effects = c("fixed", "aux"),
     conf.int = TRUE, conf.level = .8)
## # A tibble: 6 × 5
##   term                            estimate std.error conf.low conf.high
##   <chr>                              <dbl>     <dbl>    <dbl>     <dbl>
## 1 (Intercept)                      -2859.     896.   -4007.     -1741. 
## 2 flipper_length_mm                   34.5      4.70    28.6       40.5
## 3 speciesGentoo                    -3294.    1325.   -4999.     -1585. 
## 4 flipper_length_mm:speciesGentoo     17.2      6.46     8.94      25.5
## 5 sigma                              386.      17.1    365.       408. 
## 6 mean_PPD                          4325.      33.4   4283.      4368.

I believe that, while not strong evidence, this does provide some evidence that the interaction term is necessary for the model: the credible interval for the interaction coefficient is entirely above 0 and its standard error is low. However, it is apparent that this has wreaked a bit of havoc on the species coefficient- the SE there is much higher than before, so the interaction term may be picking up some excess “static” from the overall species effect. Overall though, given the plots we saw earlier, I would still say that this is a necessary addition.

Exercise 11.12

a- Simulating the posterior model:

penguins_model3 <- stan_glm(
  body_mass_g ~ flipper_length_mm + bill_length_mm + bill_depth_mm,
  data = penguin_data, family = gaussian, 
  prior_intercept = normal(4000, 250),
  chains = 4, iter = 10000, seed = 12345)

b- Below are the credible intervals for the model parameters:

posterior_interval(penguins_model3, prob = .8)
##                           10%         90%
## (Intercept)       -6851.97085 -5441.27103
## flipper_length_mm    28.66591    36.46592
## bill_length_mm       59.91385    80.94080
## bill_depth_mm        34.86143    68.68047
## sigma               321.45409   359.84261

c- All of these predictors appear to have a meaningful positive association with body mass (none of the 80% credible intervals dip below 0).

Exercise 11.13

a- Here we’ll have to simulate a few models. We’ve already simulated flipper length + species and flipper length + bill length + bill depth, but in the problem’s numbering those are the 3rd and 4th models respectively, so I will rename them to reflect that. I will then simulate the 1st and 2nd models (flipper length and species individually).

#Renaming models
model3 <- penguins_model
model4 <- penguins_model3

#Simulating model 1
model1 <- stan_glm(
  body_mass_g ~ flipper_length_mm,
  data = penguin_data, family = gaussian, 
  prior_intercept = normal(4000, 250),
  chains = 4, iter = 10000, seed = 12345)
#Simulating model 2
model2 <- stan_glm(
  body_mass_g ~ species,
  data = penguin_data, family = gaussian, 
  prior_intercept = normal(4000, 250),
  chains = 4, iter = 10000, seed = 12345)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 2e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.2 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.234043 seconds (Warm-up)
## Chain 1:                0.278732 seconds (Sampling)
## Chain 1:                0.512775 seconds (Total)
## Chain 1: 
## (Chains 2-4 produced analogous sampling progress output, omitted here;
##  each took roughly 0.5 seconds total.)

b- Below are the posterior predictive check (`pp_check()`) plots for the four models:

pp_check(model1)

pp_check(model2)

pp_check(model3)

pp_check(model4)

All of these plots show noticeable deviation from the observed data density near the peak. I can't single out an exact "best" model, but I would say that model 4's replicated densities have the most overlap with y overall.

c- First we can remove the NA values:

penguins_complete <- penguins_bayes %>% 
  select(flipper_length_mm, body_mass_g, species, 
         bill_length_mm, bill_depth_mm) %>% 
  na.omit() 
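
As a quick sanity check on what this step does, here is a toy illustration with made-up values (not the real penguin data): `na.omit()` drops every row containing a missing value in any of the selected columns.

```r
# Toy data frame with made-up measurements (not the real penguin data)
toy <- data.frame(flipper_length_mm = c(181, NA, 195),
                  body_mass_g       = c(3750, 3800, NA))

# na.omit() keeps only rows with no NA in any column
complete_toy <- na.omit(toy)
nrow(complete_toy)  # only the first, fully observed row survives
```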

Now we can use 10-fold cross-validation to compare the models:

prediction_summary_cv(model = model1, data = penguins_complete, k = 10)
## $folds
##    fold      mae mae_scaled within_50 within_95
## 1     1 345.5820  0.8673599 0.4571429 0.9428571
## 2     2 253.2693  0.6354342 0.5588235 0.9411765
## 3     3 204.0943  0.5106856 0.6176471 0.9705882
## 4     4 248.4275  0.6186864 0.5588235 1.0000000
## 5     5 287.2708  0.7211183 0.4705882 0.9705882
## 6     6 270.3711  0.6797468 0.4411765 0.9705882
## 7     7 244.8281  0.6151760 0.5588235 0.9411765
## 8     8 244.8779  0.6248588 0.5294118 0.9411765
## 9     9 222.6000  0.5607813 0.5588235 0.9411765
## 10   10 378.3286  0.9638993 0.4000000 0.9142857
## 
## $cv
##       mae mae_scaled within_50 within_95
## 1 269.965  0.6797747 0.5151261 0.9533613
prediction_summary_cv(model = model2, data = penguins_complete, k = 10)
## $folds
##    fold      mae mae_scaled within_50 within_95
## 1     1 347.1219  0.7508762 0.4000000 0.9142857
## 2     2 328.9793  0.7058783 0.5000000 0.9705882
## 3     3 370.3309  0.8011433 0.4411765 0.9411765
## 4     4 404.2902  0.8793566 0.4411765 0.9411765
## 5     5 310.6309  0.6601133 0.5294118 0.9705882
## 6     6 276.4787  0.5948840 0.5294118 0.9411765
## 7     7 309.8590  0.6561397 0.5294118 1.0000000
## 8     8 383.2264  0.8227089 0.4411765 0.9705882
## 9     9 364.4174  0.7833938 0.4411765 0.9411765
## 10   10 303.5640  0.6423370 0.5142857 0.9714286
## 
## $cv
##        mae mae_scaled within_50 within_95
## 1 339.8899  0.7296831 0.4767227 0.9562185
prediction_summary_cv(model = model3, data = penguins_complete, k = 10)
## $folds
##    fold      mae mae_scaled within_50 within_95
## 1     1 272.3401  0.7234557 0.4571429 0.9428571
## 2     2 215.0941  0.5710529 0.5588235 0.8823529
## 3     3 202.0770  0.5273814 0.5294118 1.0000000
## 4     4 263.2041  0.6995268 0.4705882 0.9411765
## 5     5 283.0534  0.7445974 0.4705882 0.9705882
## 6     6 195.9879  0.5143677 0.6176471 0.9117647
## 7     7 245.7978  0.6374846 0.5294118 1.0000000
## 8     8 315.0539  0.8298904 0.3823529 0.9705882
## 9     9 251.2961  0.6621675 0.5000000 0.9705882
## 10   10 278.4290  0.7428912 0.4857143 0.9142857
## 
## $cv
##        mae mae_scaled within_50 within_95
## 1 252.2333  0.6652816 0.5001681 0.9504202
prediction_summary_cv(model = model4, data = penguins_complete, k = 10)
## $folds
##    fold      mae mae_scaled within_50 within_95
## 1     1 335.0204  0.8510179 0.3428571 0.9142857
## 2     2 165.1987  0.4131759 0.6764706 0.9705882
## 3     3 250.5863  0.6288660 0.5294118 0.9117647
## 4     4 282.6148  0.7060228 0.4705882 0.9705882
## 5     5 248.8780  0.6207744 0.5294118 0.9705882
## 6     6 262.6641  0.6621423 0.5000000 0.9117647
## 7     7 219.7428  0.5459609 0.5294118 0.9411765
## 8     8 236.7492  0.5984812 0.5294118 0.9705882
## 9     9 304.6534  0.7693367 0.4705882 0.9411765
## 10   10 317.8155  0.8044497 0.3714286 0.9142857
## 
## $cv
##        mae mae_scaled within_50 within_95
## 1 262.3923  0.6600228  0.494958 0.9416807

Overall, it looks like model 3 has the lowest cross-validated MAE, but model 1 has the highest within-50% coverage.
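
To make that comparison easier to read, the four `$cv` summaries above can be gathered into a single table (the values are transcribed by hand from the output; `cv_summary` is just an illustrative name):

```r
# Overall cross-validation summaries, copied from the $cv rows above
cv_summary <- data.frame(
  model      = c("model1", "model2", "model3", "model4"),
  mae        = c(269.9650, 339.8899, 252.2333, 262.3923),
  mae_scaled = c(0.6797747, 0.7296831, 0.6652816, 0.6600228),
  within_50  = c(0.5151261, 0.4767227, 0.5001681, 0.4949580),
  within_95  = c(0.9533613, 0.9562185, 0.9504202, 0.9416807))

# Sort by MAE: model3 comes out on top for that metric
cv_summary[order(cv_summary$mae), ]
```

Sorting by `mae` confirms model 3's edge on that metric, while `which.max(cv_summary$within_50)` points to model 1.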

d- Below are the ELPD estimates:

# Calculate ELPD for the 4 models
set.seed(12345)
loo_1 <- loo(model1)
loo_2 <- loo(model2)
loo_3 <- loo(model3)
loo_4 <- loo(model4)

# Results
c(loo_1$estimates[1], loo_2$estimates[1], 
  loo_3$estimates[1], loo_4$estimates[1])
## [1] -1960.773 -2012.999 -1959.460 -1922.982
# Compare the ELPD for the 4 models
loo_compare(loo_1, loo_2, loo_3, loo_4)
##        elpd_diff se_diff
## model4   0.0       0.0  
## model3 -36.5       7.2  
## model1 -37.8       6.9  
## model2 -90.0      11.4

Based on the ELPD, model 4 is by far the “best” model in terms of posterior predictive accuracy.
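
One common rule of thumb supports this reading: a model's predictive performance is meaningfully below the best model's when |elpd_diff| exceeds roughly two standard errors. A quick check against the `loo_compare()` numbers above (transcribed by hand):

```r
# elpd_diff and se_diff relative to model4, copied from loo_compare() above
elpd_diff <- c(model3 = -36.5, model1 = -37.8, model2 = -90.0)
se_diff   <- c(model3 =   7.2, model1 =   6.9, model2 =  11.4)

# TRUE means the ELPD deficit is large relative to its uncertainty;
# here every other model falls well behind model4
abs(elpd_diff) > 2 * se_diff
```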

e- I would argue that model 4 is the "best" model. Although models 1 and 3 had better within-50% coverage and MAE, those values were numerically very close across all of the models, while the ELPD differences are considerably larger, so I am more willing to trust the ELPD results.

Exercise- tell a bit about the data you will use for next week’s presentation

I’m planning on using the dataset from my undergrad senior honors thesis on how breakups impact social media networks. I haven’t decided which of my analyses I want to redo with Bayesian statistics, but I was thinking of fitting one or two models of the impact of interpersonal electronic surveillance (IES) on breakup distress (either a Normal or a Gamma-Poisson model), and maybe one model of the impact of digital possessions (deleting/untagging photos) on social media use (with another model adding in a “control” variable of age or long distance). For the latter, I’m not sure whether it should be a Beta-Binomial or a Gamma-Poisson model, since the dependent variable (change in social media use) is a dummy variable: I could treat it as a 0-1 “probability of changing social media use” and fit a Beta-Binomial model (which is what I believe is correct), or I could examine whether I need to change the way the data is operationalized to apply a Gamma-Poisson model.